Less is more: towards an optimal universal description of protein folds

Authors

  • Joseph D. Szustakowski
  • Simon Kasif
  • Zhiping Weng
Abstract

Motivation: Identification and characterization of protein structure regularities can reveal the mechanisms governing protein structure, function and evolution. Here we focus on an intermediate level of regularity. We have developed automated methods to systematically construct a dictionary of supersecondary structures that can be used as 'protein parts' to describe fold-sized structures.

Results: The dictionary was constructed by aligning representative structures of all known folds, clustering similar substructures and selecting the most descriptive substructures in a minimum description length fashion. We show that the dictionary is compact and descriptive, capable of describing a substantial fraction of all known protein folds. We performed simulations using independent sets of training and testing folds. Dictionaries generated using the training set had high coverage over the folds in the testing set, suggesting that dictionary entries reflect general features of protein structures and should be capable of describing novel protein folds.
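The selection step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give their cost function, so the code below assumes a simple two-part description length (a fixed cost per dictionary entry plus a cost per residue left undescribed) and a greedy selection rule. The names `greedy_mdl_dictionary`, `coverage`, `entry_cost` and `residue_cost`, the `(fold id, residue index)` encoding, and the toy data are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

Residue = Tuple[str, int]  # hypothetical encoding: (fold id, residue index)


def greedy_mdl_dictionary(
    candidates: Dict[str, Set[Residue]],
    universe: Set[Residue],
    entry_cost: float = 20.0,
    residue_cost: float = 1.0,
) -> Tuple[List[str], Set[Residue]]:
    """Greedily pick substructure clusters while each pick shortens the description.

    Assumed description length (an illustration, not the paper's formula):
        entry_cost * (number of selected entries)
        + residue_cost * (number of residues no selected entry describes)
    """
    selected: List[str] = []
    covered: Set[Residue] = set()
    # Only residues inside the universe count towards coverage.
    remaining = {name: cov & universe for name, cov in candidates.items()}
    while remaining:
        # Sorting the names makes tie-breaking deterministic.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        saving = residue_cost * len(remaining[best] - covered)
        if saving <= entry_cost:
            break  # adding this entry would not shorten the total description
        selected.append(best)
        covered |= remaining.pop(best)
    return selected, covered


def coverage(covered: Set[Residue], universe: Set[Residue]) -> float:
    """Fraction of residues in `universe` described by the dictionary."""
    return len(covered & universe) / len(universe) if universe else 0.0


# Toy data: two folds, three candidate substructure clusters.
universe = {("A", i) for i in range(100)} | {("B", i) for i in range(50)}
candidates = {
    "beta_hairpin": {("A", i) for i in range(40)} | {("B", i) for i in range(30)},
    "helix_turn_helix": {("A", i) for i in range(30, 80)},
    "rare_motif": {("B", i) for i in range(45, 50)},
}
selected, covered = greedy_mdl_dictionary(candidates, universe)
print(selected, round(coverage(covered, universe), 3))
```

In this toy setting, a cluster that describes only a handful of residues is left out because its entry cost exceeds the residues it would explain, which is the trade-off between compactness and descriptiveness that the abstract refers to. The same `coverage` function could be applied to a held-out set of folds to mimic the training and testing evaluation, again only as an approximation of the procedure the authors describe.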




Journal:
  • Bioinformatics

Volume 21 Suppl 2  Issue 

Pages  -

Publication date 2005